Tutorial Part VI — Population analysis in R
Introduction
This is a brief tutorial about constructing and analyzing relatively simple Cormack-Jolly-Seber capture-mark-recapture (CMR) models in R. For this we use the R package “marked” (Laake et al. 2013).
First we get an overview of the data, then we develop different models and finally we analyse and visualize the results.
Setup
Load (and install) the following packages:
package.list=c("here",
"marked",
"skimr",
"sf",
"tmap",
"devtools",
"rnaturalearthdata",
"ggplot2",
"readr",
"tidyr")
for (package in package.list) {
if (!require(package, character.only=T, quietly=T)) {
install.packages(package) ##take out for knitting
library(package, character.only=T)
}
}set the theme for some ggplots:
Load and manipulate data
Here, we want to directly set some columns as factors for all subsequent analyses:
starlings <- readr::read_csv(here("data", "data_berlin", "animal_data", "starlings.csv"))
starlings$sex <- as.factor(starlings$sex) # convert to factor
starlings$habitat <- as.factor(starlings$habitat) # convert to factor
starlings$infected <- as.factor(starlings$infected) # convert to factor
names(starlings)[1] <- "ID" # rename first column to ID column
# Short overview
skim(starlings)| Name | starlings |
| Number of rows | 605 |
| Number of columns | 5 |
| _______________________ | |
| Column type frequency: | |
| character | 1 |
| factor | 3 |
| numeric | 1 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| ch | 0 | 1 | 7 | 7 | 0 | 32 | 0 |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| sex | 0 | 1.00 | FALSE | 2 | Fem: 323, Mal: 282 |
| habitat | 0 | 1.00 | FALSE | 2 | urb: 311, rur: 294 |
| infected | 202 | 0.67 | FALSE | 2 | Yes: 214, No: 189 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| ID | 0 | 1 | 303 | 174.79 | 1 | 152 | 303 | 454 | 605 | ▇▇▇▇▇ |
Each row in this table represents one individual. We have 5 columns in this dataset: ID, the individual identifier, ch, the capture history (0 = undetected, 1 = detected),the sex of the captured individuals, the habitat type the birds were captured, and health status, i.e. if birds were infected
Explore capture locations
#load environmental data on capturing
locations <-
read.csv(here(
"data",
"data_berlin",
"animal_data",
"starlings_capture_location_data.csv"
))
# Short overview
skim(locations)| Name | locations |
| Number of rows | 2 |
| Number of columns | 10 |
| _______________________ | |
| Column type frequency: | |
| character | 1 |
| numeric | 9 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| Habitat | 0 | 1 | 5 | 5 | 0 | 2 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| Lon | 0 | 1 | 13.57 | 0.19 | 13.44 | 13.51 | 13.57 | 13.64 | 13.71 | ▇▁▁▁▇ |
| Lat | 0 | 1 | 52.44 | 0.05 | 52.40 | 52.42 | 52.44 | 52.45 | 52.47 | ▇▁▁▁▇ |
| temperature_2016 | 0 | 1 | 30.25 | 1.06 | 29.50 | 29.88 | 30.25 | 30.62 | 31.00 | ▇▁▁▁▇ |
| temperature_2017 | 0 | 1 | 39.00 | 5.66 | 35.00 | 37.00 | 39.00 | 41.00 | 43.00 | ▇▁▁▁▇ |
| temperature_2018 | 0 | 1 | 33.50 | 3.54 | 31.00 | 32.25 | 33.50 | 34.75 | 36.00 | ▇▁▁▁▇ |
| temperature_2019 | 0 | 1 | 28.50 | 2.12 | 27.00 | 27.75 | 28.50 | 29.25 | 30.00 | ▇▁▁▁▇ |
| temperature_2020 | 0 | 1 | 28.50 | 0.71 | 28.00 | 28.25 | 28.50 | 28.75 | 29.00 | ▇▁▁▁▇ |
| temperature_2021 | 0 | 1 | 30.00 | 1.41 | 29.00 | 29.50 | 30.00 | 30.50 | 31.00 | ▇▁▁▁▇ |
| temperature_2022 | 0 | 1 | 29.50 | 1.41 | 28.50 | 29.00 | 29.50 | 30.00 | 30.50 | ▇▁▁▁▇ |
#transform our locations into a spatial sf-object
loc_sf <- st_as_sf(locations, coords = c("Lon", "Lat"), crs = 4326)
#plot the locations
d6berlin::base_map_imp(globe = TRUE, resolution = 500)+
geom_sf(data=st_transform(loc_sf, crs=3035),aes(col=Habitat),size=5)+
guides(color = guide_legend(direction = "horizontal",
title.position = "top",
title.hjust = .5))Plot the capturing data
# Capture histories
ggplot(data = starlings, mapping = aes(x = ch)) +
geom_bar(colour = "#262674", fill = "#262674") +
scale_y_continuous(
limits = c(0, 40),
breaks = seq(0, 40, by = 10),
expand = c(.001, .001)
) +
labs(x = "capture histories") +
theme(axis.text.x = element_text(angle = 45))# Sex
ggplot(data = starlings,mapping = aes(x = sex, fill = sex)) +
geom_bar() +
scale_fill_manual(values = c(Female = "salmon", Male = "lightblue"))# Sex and capture histories
ggplot(data = starlings, mapping = aes(x = ch, fill = sex)) +
geom_bar(position = "stack") +
scale_y_continuous(
limits = c(0, 40),
breaks = seq(0, 40, by = 10),
expand = c(.001, .001)
) +
labs(x = "capture histories", fill = "sex") +
theme(axis.text.x = element_text(angle = 45)) +
scale_fill_manual(values = c(Female = "salmon", Male = "lightblue"))Plot environmental data
#data can be in wide or long format, often long-format is preferred
locations_long <- pivot_longer(
data = locations,
names_to = "year",
values_to = "temp",
cols = starts_with("Temp"))
# check max. temperatures between habitats
ggplot(locations_long, aes(x = year, y = temp, group = Habitat)) +
geom_line(aes(col = Habitat), linewidth = 2) +
scale_colour_manual(values = c(urban = "darkgoldenrod", rural = "darkblue"))+
theme(axis.text.x = element_text(angle = 45,hjust=0.5,vjust=0.5))Basic Cormack-Jolly-Seber models
The default model uses constant detection and survival probabilities and constant time intervals between captures. We fit the model and have a look at the results. Please note: The estimates are on a logit scale
#models require data frames, tibbles produce errors.
starlings <- as.data.frame(starlings)
#apply basic crm model
cjs_m1 <- crm(data = starlings)
#take a look
cjs_m1##
## crm Model Summary
##
## Npar : 2
## -2lnL: 1291.825
## AIC : 1295.825
##
## Beta
## Estimate
## Phi.(Intercept) 0.06757843
## p.(Intercept) 2.22899510
Npar = number of parameters -2lnL = 2 log(Likelihood) of the fitted model AIC = Akaikes Information Criterion Phi = Survival probability between capture events p = capture probability per capture event
Estimate parameters on real scale
Write a function to assess results more intuitively on the real scale instead of the logit-scale. We can just paste our results from the previous model output
# write function
inverse_logit <- function(x) {
exp(x) / (1 + exp(x))
}
#e.g.
inverse_logit(0.06757844) # this is our survival probability## [1] 0.5168882
## [1] 0.9028232
## (Intercept)
## 0.5168882
## (Intercept)
## 0.9028232
Refit the model with confidence intervals
##
## crm Model Summary
##
## Npar : 2
## -2lnL: 1291.825
## AIC : 1295.825
##
## Beta
## Estimate se lcl ucl
## Phi.(Intercept) 0.06757843 0.07249005 -0.07450207 0.2096589
## p.(Intercept) 2.22899510 0.24951951 1.73993685 2.7180533
Predict
Now we transform the logit estimates to real estimates by using the inverse logit
## $Phi
## occ estimate se lcl ucl
## 1 1 0.5168882 0.01810184 0.4813831 0.5522236
##
## $p
## occ estimate se lcl ucl
## 1 2 0.9028232 0.02189121 0.850679 0.9380836
What does that mean? The survival probability between capture events was 0.52. The probability to capture an individual during a capture event was 0.90. Let’s visualize this in a plot:
# store prediction as object
mypred_list <- predict(object = cjs_m1_2,
newdata = data.frame(sex = c("Female", "Male")),
se = TRUE)
#mypred_list ## this is a list. Convert into data.frame:
mypred_df <- do.call(rbind, mypred_list)
## add the rownames phi and p as column
mypred_df$prob <- rownames(mypred_df)
ggplot(data=mypred_df, aes(x=prob, y=estimate, color = prob)) +
geom_point(size = 5) +
geom_errorbar(aes(ymin = lcl, ymax = ucl, col = prob),width = 0.25) +
ggtitle("detection (p) and survival prob. (phi)") +
xlab("result") + ylab("estimate") +
theme_bw(base_size = 20) As mentioned before, we assume constant time intervals between capturing events. In reality, this is often not the case, e.g. capturing events have to be postponed due to weather conditions. Now we build a model where we take this into account.
CJS with irregular sampling intervals
cjs_m1_time <- crm(data = starlings,
# time.interval vector is the interval between each capture event.
# for 7 capture events, there are 6 intervals
time.intervals = c(1, 1, 1, 1.6, 2, 3))
# predict model
predict(object = cjs_m1_time)## $Phi
## occ estimate
## 1 6 0.688108
##
## $p
## occ estimate
## 1 7 0.8858232
Now we have a higher survival probability of 0.69 between capturing events and a lower capture probability per capturing event of 0.89
Exercise
We have sampled two different populations in urban and rural habitat. Split the data set (Hint: use [] or dplyr::filter()) and compare survival probabilities between the two habitats
Fit multiple models simultaneously
Normally, there are different hypotheses about the factors that influence the survival probability of individuals. These can be characteristics of the individuals, such as sex, age or weight, or characteristics of the environment of the individuals, such as sealing, temperature or the number of nesting opportunities. For this purpose, different models are built that take the respective factors (explanatory or predictor variables) into account.
For a flexible approach we process the data, create design data to add variables that depend on time and fit the different models.
Group data
We process the data and define the grouping variables
## ID ch sex habitat infected freq id group initial.age
## 40 40 0,0,0,0,0,1,0 Female rural Yes 1 1 5 0
## 41 41 0,0,0,0,0,1,0 Female rural Yes 1 2 5 0
## 42 42 0,0,0,0,0,1,0 Female rural No 1 3 1 0
## 43 43 0,0,0,0,0,1,0 Female rural Yes 1 4 5 0
## 44 44 0,0,0,0,0,1,0 Female rural No 1 5 1 0
## 45 45 0,0,0,0,0,1,0 Female rural Yes 1 6 5 0
Create the design data
## id occ time cohort age ID sex habitat infected freq group Time
## 1 1 1 1 6 0 40 Female rural Yes 1 FemaleruralYes 0
## 2 1 2 2 6 0 40 Female rural Yes 1 FemaleruralYes 1
## 3 1 3 3 6 0 40 Female rural Yes 1 FemaleruralYes 2
## 4 1 4 4 6 0 40 Female rural Yes 1 FemaleruralYes 3
## 5 1 5 5 6 0 40 Female rural Yes 1 FemaleruralYes 4
## 6 1 6 6 6 0 40 Female rural Yes 1 FemaleruralYes 5
## Cohort Age order
## 1 5 0 1
## 2 5 0 2
## 3 5 0 3
## 4 5 0 4
## 5 5 0 5
## 6 5 0 6
Model formulas
Build formulas for each parameter
M2
Fit a model based on the design data with constant survival and different detection probabilities between sexes.
## Number of evaluations: 100 -2lnl: 1289.126943
##
## crm Model Summary
##
## Npar : 3
## -2lnL: 1289.117
## AIC : 1295.117
##
## Beta
## Estimate
## Phi.(Intercept) 0.0793894
## p.(Intercept) 1.7877066
## p.sexMale 0.7906268
## $Phi
## occ estimate
## 1 6 0.5198369
##
## $p
## sex occ estimate
## 1 Female 7 0.8566459
## 2 Male 7 0.9294541
M3
Fit another model based on the design data with different survival and detection probabilities between sexes
cjs_m3 <- crm(
m2_proc,
m2_design,
model.parameters = list(Phi = phi_sex,
p = p_sex),
accumulate = FALSE
)## Number of evaluations: 100 -2lnl: 1283.680512 Number of evaluations: 200 -2lnl: 1283.680512
##
## crm Model Summary
##
## Npar : 4
## -2lnL: 1283.681
## AIC : 1291.681
##
## Beta
## Estimate
## Phi.(Intercept) -0.09718597
## Phi.sexMale 0.34094477
## p.(Intercept) 1.98842948
## p.sexMale 0.46984162
## $Phi
## sex occ estimate
## 1 Female 6 0.4757226
## 2 Male 6 0.5606397
##
## $p
## sex occ estimate
## 1 Female 7 0.8795769
## 2 Male 7 0.9211642
M4
Fit another model based on the design data with time-dependent survival and detection probability time is a variable that was introduced by the make.design.data() function
cjs_m4 <- crm(
m2_proc,
m2_design,
model.parameters = list(Phi = list(formula = ~ time),
p = list(formula = ~ time)),
accumulate = FALSE
)## Number of evaluations: 100 -2lnl: 1244.670707 Number of evaluations: 200 -2lnl: 1239.725494 Number of evaluations: 300 -2lnl: 1239.319519 Number of evaluations: 400 -2lnl: 1239.303701 Number of evaluations: 500 -2lnl: 1239.314667 Number of evaluations: 600 -2lnl: 1239.332245 Number of evaluations: 700 -2lnl: 1240.362774 Number of evaluations: 800 -2lnl: 1239.305679 Number of evaluations: 900 -2lnl: 1239.798211 Number of evaluations: 1000 -2lnl: 1239.31019 Number of evaluations: 1100 -2lnl: 1239.303675
##
## crm Model Summary
##
## Npar : 12
## -2lnL: 1239.304
## AIC : 1263.304
##
## Beta
## Estimate
## Phi.(Intercept) 1.5987971
## Phi.time2 -2.6225405
## Phi.time3 -1.8076489
## Phi.time4 -1.1463136
## Phi.time5 -1.3796043
## Phi.time6 -0.6965196
## p.(Intercept) 1.2003709
## p.time3 1.5729892
## p.time4 1.2288969
## p.time5 0.7978441
## p.time6 1.2524201
## p.time7 -0.2257743
## $Phi
## time occ estimate
## 1 6 6 0.7114173
## 2 5 5 0.5545798
## 3 4 4 0.6112295
## 4 3 3 0.4479760
## 5 2 2 0.2642989
## 6 1 1 0.8318502
##
## $p
## time occ estimate
## 1 7 7 0.7260348
## 2 6 6 0.9207653
## 3 5 5 0.8806095
## 4 4 4 0.9190321
## 5 3 3 0.9412192
## 6 2 2 0.7685908
Multiple models + model selection (AIC)
When testing multiple hypotheses, you can speed up model formulation for multiple models.
Caution! Models should always represent hypotheses according to biological meaningful questions, don’t just fit many models without thinking about them. Generally, formulate hypotheses before formulating models!
fit.starlings.cjs.models <- function() {
# Apparent survival (Phi) formula
Phi.time <- list(formula = ~ time)
Phi.sex <- list(formula = ~ sex)
Phi.sex.time <-
list(formula = ~ sex * time) # interaction of sex and time
Phi.habitat.time <- list(formula = ~ habitat * time)
Phi.dot <- list(formula = ~ 1) # constant survival
# Detection probability (p) formula
p.sex <- list(formula = ~ sex) # differs between males and females
p.time <-
list(formula = ~ time) # one discrete estimate of p per capture event
p.habitat <- list(formula = ~ habitat)
p.dot <- list(formula = ~ 1) # constant detection
# Construct all combinations and put into one model table
cml <-
create.model.list(c("Phi", "p")) # makes all possibile combinations of those parameter formulas
results <- crm.wrapper(
cml,
data = m2_proc,
ddl = m2_design,
external = FALSE,
accumulate = FALSE,
hessian = TRUE
)
return(results)
}Model fitting summary
## model npar AIC DeltaAIC weight
## 13 Phi(~sex * time)p(~1) 13 1251.441 0.000000 2.983796e-01
## 15 Phi(~sex * time)p(~sex) 14 1252.538 1.096923 1.724152e-01
## 5 Phi(~habitat * time)p(~1) 13 1252.975 1.534076 1.385635e-01
## 14 Phi(~sex * time)p(~habitat) 14 1253.052 1.610556 1.333648e-01
## 7 Phi(~habitat * time)p(~sex) 14 1253.221 1.780076 1.225267e-01
## 6 Phi(~habitat * time)p(~habitat) 14 1254.767 3.325259 5.658460e-02
## 17 Phi(~time)p(~1) 7 1256.353 4.911634 2.559891e-02
## 19 Phi(~time)p(~sex) 8 1256.478 5.036444 2.405023e-02
## 18 Phi(~time)p(~habitat) 8 1257.978 6.536567 1.135982e-02
## 16 Phi(~sex * time)p(~time) 18 1258.130 6.688597 1.052831e-02
## 8 Phi(~habitat * time)p(~time) 18 1259.310 7.868629 5.836038e-03
## 20 Phi(~time)p(~time) 12 1263.304 11.862290 7.923290e-04
## 9 Phi(~sex)p(~1) 3 1290.564 39.122357 9.538000e-10
## 11 Phi(~sex)p(~sex) 4 1291.681 40.239128 5.456996e-10
## 10 Phi(~sex)p(~habitat) 4 1292.362 40.920736 3.881004e-10
## 12 Phi(~sex)p(~time) 8 1294.274 42.832830 1.491895e-10
## 3 Phi(~1)p(~sex) 3 1295.117 43.675758 9.788110e-11
## 1 Phi(~1)p(~1) 2 1295.825 44.383660 6.870364e-11
## 2 Phi(~1)p(~habitat) 3 1297.634 46.193096 2.780135e-11
## 4 Phi(~1)p(~time) 7 1298.782 47.340491 1.566434e-11
## neg2lnl convergence
## 13 1225.441 0
## 15 1224.538 0
## 5 1226.975 0
## 14 1225.052 0
## 7 1225.221 0
## 6 1226.767 0
## 17 1242.353 0
## 19 1240.478 0
## 18 1241.978 0
## 16 1222.130 0
## 8 1223.310 0
## 20 1239.304 0
## 9 1284.564 0
## 11 1283.681 0
## 10 1284.362 0
## 12 1278.274 0
## 3 1289.117 0
## 1 1291.825 0
## 2 1291.634 0
## 4 1284.782 0
Time-dependent variables in CJS models
Now, we have already learned how we can incorporate static individual covariates. But covariates can also vary between sampling occasions. Here, we are interested how a binary but time-varying variable heat-wave will impact survival and detection probabilities. First, we will create a binary variable. In addition, we will see how continuous individual covariates (here: weight) can be used to describe demographic parameters. Caution, here we just assume weight to be constant across an individuals’ life (of course it is not), but we can’t document the weight of individuals that are not captured. There are ways to incorporate missing data (e.g. imputation or assigning weight classes such as small, medium, large), but we will not cover them here.
# create variable
Heat_wave <- matrix(rep(c(0, 1, 1, 0, 0, 0), each = nrow(starlings)),
ncol =6)
# rename columns
colnames(Heat_wave) = paste("Heat_wave", 1:6, sep = "")
#bind rows
starlings_heatwave <- cbind(starlings, Heat_wave)
#generate some random weight
starlings_heatwave$weight <- round(rnorm(nrow(starlings), 80, 3))
#prepare data
starlings.proc <- process.data(starlings_heatwave)
# Design matrix for survival
design.Phi <-
list(static = c("weight", "sex"),
time.varying = c("Heat_wave"))
# Design matrix for detection
design.p <- list(static = c("sex"))
# combine in one list
design.parameters <- list(Phi = design.Phi, p = design.p)
# processed data and design parameters into model design matrix
m3_design <- make.design.data(data = starlings.proc,
parameters = design.parameters)
#check the names in the design matrices
names(m3_design$Phi)## [1] "id" "occ" "time" "cohort" "age" "Heat_wave"
## [7] "weight" "sex" "Time" "Cohort" "Age" "order"
## [1] "id" "occ" "time" "cohort" "age" "sex" "Time" "Cohort"
## [9] "Age" "fix" "order"
# Define survival prob.
Phi.heat.sex <- list(formula = ~ Heat_wave * sex)
# Define detection prob.
p.sex <- list(formula = ~ sex)
# fit the model
m3 <- crm(
starlings.proc,
m3_design,
hessian = TRUE,
model.parameters = list(Phi = Phi.heat.sex,
p = p.sex)
)## Number of evaluations: 100 -2lnl: 1244.121204 Number of evaluations: 200 -2lnl: 1243.974874 Number of evaluations: 300 -2lnl: 1244.044779 Number of evaluations: 100 -2lnl: 1245.196681
## $Phi
## Heat_wave sex occ estimate se lcl ucl
## 1 0 Female 6 0.5888999 0.03365102 0.5217320 0.6529102
## 2 0 Male 6 0.6029780 0.03050136 0.5419426 0.6609683
## 3 1 Female 3 0.2874165 0.03769815 0.2194742 0.3665152
## 4 1 Male 3 0.4682230 0.04526839 0.3813970 0.5570169
##
## $p
## sex occ estimate se lcl ucl
## 1 Female 7 0.8722827 0.03817459 0.7772436 0.9304042
## 2 Male 7 0.9191637 0.02691341 0.8482694 0.9585520
More complex models & predicting
Here, we are interested in individual weight.
Phi.weight.heat <-list(formula=~weight*Heat_wave)
p.sex <-list(formula=~sex)
m4 <- crm(
starlings.proc,
m3_design,
hessian = TRUE,
model.parameters = list(Phi = Phi.weight.heat,
p = p.sex)
)## Number of evaluations: 100 -2lnl: 1252.677436 Number of evaluations: 200 -2lnl: 1255.89512 Number of evaluations: 300 -2lnl: 1252.968609 Number of evaluations: 400 -2lnl: 1252.676557 Number of evaluations: 100 -2lnl: 1257.437678
## $Phi
## weight Heat_wave occ estimate se lcl ucl
## 1 79 0 6 0.6003154 0.02366211 0.5531702 0.6456726
## 2 78 0 6 0.6030211 0.02649719 0.5501114 0.6536262
## 3 76 0 6 0.6084135 0.03577190 0.5365250 0.6758874
## 4 87 0 6 0.5784621 0.05581375 0.4670070 0.6824580
## 5 84 0 6 0.5866981 0.03700753 0.5127970 0.6568898
## 6 77 0 6 0.6057205 0.03070027 0.5442351 0.6640317
## 7 82 0 6 0.5921622 0.02713701 0.5380946 0.6440871
## 8 80 0 6 0.5976036 0.02274455 0.5523348 0.6412679
## 9 81 0 6 0.5948858 0.02399419 0.5471243 0.6409161
## 10 85 0 6 0.5839579 0.04294503 0.4981462 0.6649652
## 11 75 0 6 0.6110998 0.04137545 0.5276484 0.6885113
## 12 83 0 6 0.5894329 0.03163288 0.5263314 0.6497225
## 13 86 0 6 0.5812125 0.04925269 0.4827940 0.6735658
## 14 73 0 6 0.6164522 0.05343803 0.5078826 0.7145325
## 15 74 0 4 0.6137795 0.04730548 0.5180168 0.7014801
## 16 90 0 4 0.5701823 0.07641024 0.4186053 0.7096513
## 17 80 1 3 0.3691765 0.02984352 0.3128306 0.4293305
## 18 86 1 3 0.4125508 0.06882940 0.2869833 0.5506308
## 19 78 1 3 0.3551391 0.03739509 0.2856581 0.4313161
## 20 77 1 3 0.3482111 0.04407812 0.2674567 0.4387446
## 21 81 1 3 0.3762808 0.03081215 0.3180633 0.4383054
## 22 79 1 3 0.3621284 0.03230885 0.3014750 0.4275162
## 23 84 1 3 0.3979045 0.04992160 0.3052082 0.4985524
## 24 83 1 3 0.3906476 0.04173233 0.3125535 0.4747780
## 25 82 1 3 0.3834388 0.03507171 0.3174010 0.4540764
## 26 76 1 3 0.3413467 0.05162750 0.2483609 0.4483786
## 27 87 1 3 0.4199341 0.07901220 0.2771182 0.5775476
## 28 85 1 3 0.4052065 0.05906622 0.2964706 0.5241130
## 29 75 1 2 0.3345482 0.05961159 0.2292578 0.4593738
## 30 88 0 6 0.5757068 0.06255565 0.4509656 0.6914950
## 31 72 0 5 0.6191180 0.05969651 0.4974060 0.7275026
## 32 74 1 3 0.3278177 0.06778289 0.2106540 0.4712427
## 33 72 1 2 0.3145687 0.08415756 0.1759748 0.4965423
##
## $p
## sex occ estimate se lcl ucl
## 1 Female 7 0.8568973 0.03983628 0.7600623 0.9188251
## 2 Male 7 0.9240681 0.02511068 0.8578462 0.9608486
Okay, we get estimates for each measured weight now, this is not really useful (imagine we would measure weight more precisely and end up with many more parameters).
In this case, it is very useful to create a new data frame, and use the predict function on that data:
new_starling <- expand.grid(sex=c("Female","Male"),weight=60:80,Heat_wave1=0,
Heat_wave2=1,Heat_wave3=1,Heat_wave4=0,Heat_wave5=0,Heat_wave6=0)
# predict with the newdata option
prediction <- predict(m4,newdata=new_starling,se=TRUE)
# Instead of binary factor, use the actual name
prediction$Phi$Heat_wave = factor(prediction$Phi$Heat_wave, labels = c("No_Heatwave", "Heatwave"))
# plot data
ggplot(prediction$Phi, aes(weight, estimate, ymin = lcl, ymax = ucl)) +
geom_errorbar(width = 0.2) +
geom_point() +
geom_line() +
xlab("\nWeight") + ylab("Survival\n") + facet_grid(Heat_wave ~ .)Exercise
Some exercises:
Exercise 6.1
The data set starlings contains one column (“infected”) that represents an infection with an incurable disease (i.e. a static variable comparable to sex). Formulate hypothesis associated with this infection status (write them down!) & try to build simple CMR models. Think about why infection could matter for capture probability or survival. There are some NAs in the data set, maybe try ?na-omit() and remove some of the observations without data from all analyses.Discuss the results in one sentence
Exercise 6.2
In our previous model (m3), we saw that sex & heatwave played an important role in individual survival. Formulate a model that additionally accounts for the habitat type (hint: * or :)
Exercise 6.3 (advanced)
This exercise will require some data-wrangling, the data set is not yet prepared: In the file locations, you will find measured temperatures for each year.
(Hint: You might need dplyr::left_join() and dplyr::rename() to get all data in the right format and bind the rows according to the habitat type.)
Test how temperature affects survival, and how sex affects detection probability. Then predict the survival probabilities for extreme weather scenarios (temperatures of 30,33,35,38,41,44 degree Celsius)
Session Info
## [1] "2025-03-12 15:19:33 CET"
## R version 4.3.3 (2024-02-29 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8
## [2] LC_CTYPE=English_United Kingdom.utf8
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=C
##
## time zone: Europe/Berlin
## tzcode source: internal
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] d6berlin_1.0.0 tidyr_1.3.1 readr_2.1.5
## [4] ggplot2_3.5.0 rnaturalearthdata_1.0.0 devtools_2.4.5
## [7] usethis_2.2.3 tmap_3.3-4 sf_1.0-16
## [10] skimr_2.1.5 marked_1.2.8 lme4_1.1-35.2
## [13] Matrix_1.6-5 here_1.0.1
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 wk_0.9.1 rstudioapi_0.16.0
## [4] jsonlite_1.8.8 magrittr_2.0.3 farver_2.1.1
## [7] nloptr_2.0.3 rmarkdown_2.26 fs_1.6.3
## [10] ragg_1.3.0 vctrs_0.6.5 memoise_2.0.1
## [13] minqa_1.2.6 base64enc_0.1-3 terra_1.7-71
## [16] htmltools_0.5.8.1 leafsync_0.1.0 truncnorm_1.0-9
## [19] s2_1.1.6 raster_3.6-26 sass_0.4.9
## [22] pracma_2.4.4 KernSmooth_2.23-22 bslib_0.7.0
## [25] htmlwidgets_1.6.4 stars_0.6-5 cachem_1.0.8
## [28] TMB_1.9.11 mime_0.12 lifecycle_1.0.4
## [31] pkgconfig_2.0.3 optimx_2023-10.21 R6_2.5.1
## [34] fastmap_1.1.1 shiny_1.8.1.1 digest_0.6.35
## [37] numDeriv_2016.8-1.1 colorspace_2.1-0 rprojroot_2.0.4
## [40] prismatic_1.1.1 leafem_0.2.3 pkgload_1.3.4
## [43] textshaping_0.3.7 crosstalk_1.2.1 labeling_0.4.3
## [46] lwgeom_0.2-14 fansi_1.0.6 abind_1.4-5
## [49] compiler_4.3.3 proxy_0.4-27 remotes_2.5.0
## [52] bit64_4.0.5 withr_3.0.0 DBI_1.2.2
## [55] highr_0.10 pkgbuild_1.4.4 MASS_7.3-60.0.1
## [58] sessioninfo_1.2.2 tmaptools_3.1-1 leaflet_2.2.2
## [61] classInt_0.4-10 tools_4.3.3 units_0.8-5
## [64] httpuv_1.6.15 R2admb_0.7.16.3 glue_1.7.0
## [67] nlme_3.1-164 promises_1.3.0 grid_4.3.3
## [70] generics_0.1.3 gtable_0.3.4 tzdb_0.4.0
## [73] class_7.3-22 rmdformats_1.0.4 data.table_1.15.4
## [76] hms_1.1.3 sp_2.1-3 xml2_1.3.6
## [79] utf8_1.2.4 pillar_1.9.0 stringr_1.5.1
## [82] vroom_1.6.5 ggspatial_1.1.9 later_1.3.2
## [85] splines_4.3.3 dplyr_1.1.4 lattice_0.22-6
## [88] bit_4.0.5 tidyselect_1.2.1 miniUI_0.1.1.1
## [91] knitr_1.46 bookdown_0.38 svglite_2.1.3
## [94] xfun_0.43 expm_0.999-9 stringi_1.8.3
## [97] yaml_2.3.8 boot_1.3-30 kableExtra_1.4.0
## [100] evaluate_0.23 codetools_0.2-20 tibble_3.2.1
## [103] cli_3.6.2 xtable_1.8-4 systemfonts_1.0.6
## [106] repr_1.1.7 munsell_0.5.1 jquerylib_0.1.4
## [109] dichromat_2.0-0.1 Rcpp_1.0.12 coda_0.19-4.1
## [112] png_0.1-8 XML_3.99-0.16.1 ellipsis_0.3.2
## [115] profvis_0.3.8 urlchecker_1.0.1 viridisLite_0.4.2
## [118] scales_1.3.0 e1071_1.7-14 crayon_1.5.2
## [121] purrr_1.0.2 rlang_1.1.3